FAT File Management System Supplement

CS 345 - Project Six

About FAT-12

Clusters vs. Sectors

disk

A sector is a fixed division of a track on a disk. Sectors are fixed-size (typically 512 bytes on most media) and are the basic unit of access for any disk. A cluster is one or more contiguous sectors and the fundamental unit for DOS files. The DOS file system only works in terms of clusters. The FAT tables index “used” and “unused” clusters.

For FAT-12, a sector and a cluster are the same size, namely 512 bytes. The cluster size becomes the minimum file size for that system. The distinction between FAT-12, FAT-16, and FAT-32 is merely the size of one entry in the FAT table. FAT-12 systems contain 12 bits per entry. (FAT-32 systems have 32 bits per entry.) The FAT type is solely determined by the number of clusters on the media.

DOS disk layout

An MS-DOS floppy disk layout (FAT-12) consists of four major sections: the boot sector, FAT tables, root directory, and data area:

BS	FAT Tables																		Root Directory (14 sectors ´ 16 entries/sector = 224 entries)														File Clusters 2 - 2849
BS	FAT 1									FAT 2																							Data Area …
0	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19	20	21	22	23	24	25	26	27	28	29	30	31	32	33 - 2880

The boot sector consists of the first sector (sector 0) on the volume or disk (on disks that have multiple partitions, this will not always be sector 0). The boot sector contains specific information about the rest of organization of the file system, including how many copies of the FAT tables are present, how big a sector is, how many sectors in a cluster, etc.
FAT tables are pointers to every cluster on the disk, and indicate the number of the next cluster in the current cluster chain, the end of the cluster chain, whether a cluster is empty, or has errors. The FAT tables are the only method of finding the location of files and directories on the rest of the disk (thus, if the FAT tables become corrupted, the data on the disk is also essentially lost). There are typically two redundant copies of the FAT table on disk for data security and recovery purposes.
The Root Directory is the primary directory of the disk. Unlike other directories located in the data area of the disk, the root directory is restricted to a finite size of 224 entries (14 sectors * 16 directory entries per sector.) The starting cluster field in subdirectory directory entries that refer to the root directory will point to cluster 0.
Sector 33 is the first sector of the Data Area and corresponds to cluster 2 of the file system (the first cluster is always cluster 2). The data area contains file and directory data and spans the remaining sectors on the disk.

Programming Resources: Mapping disk data directly to structures

In many instances in Project 6, you may desire to copy a set amount of bytes from the RAM disk, and transfer them directly into a structure that has the appropriate fields already defined. For example, the boot sector of the disk contains raw data that can easily be transferred into an appropriate structure. Another example is one of the 32-byte directory entries in a sector. For memory transfers where you wish to align the bits inside of a byte into structured fields, a bit field is appropriate. However, for structures that are aligned on unsigned char, unsigned short, or unsigned long boundaries, you can simply copy the memory via memcpy into the appropriate structure.

There is one catch. In order to map the disk data to a structure you must make sure that the structure is byte aligned in memory. When a compiler allocates storage space for items in a structure, it may allocate extra space in between the item elements (padding). Disks don't do that. Two directives will force the compiler's alignment of your structures to be aligned in memory.

#pragma pack(push,1) // BYTE align in memory (no padding)

// data structure goes here…

#pragma pack(pop) // End of strict alignment

BSStruct bootSector; // The BYTE-aligned structure.
memcpy(&bootSector, &RAMDisk[0], sizeof(BSStruct))

The Boot Sector

The first sector on the volume or disk is called the boot sector and it contains information about the rest of the file system structure. You may assume that all disks and image files in this lab will conform to the FAT-12 standard. The boot sector contains the following values, which are aligned exactly as shown.

#pragma pack(push, 1) /* Byte align in memory (no padding) */

typedef struct

{ unsigned char BS_jmpBoot[3];    /* Jump instruction to the boot code */
   unsigned char BS_OEMName[8];    /* Name of system that formatted the volume */
   unsigned short BPB_BytsPerSec;   /* Bytes per sector (should be 512) */
   unsigned char BPB_SecPerClus;   /* Sectors per cluster (FAT-12 = 1) */
   unsigned short BPB_RsvdSecCnt;   /* Reserved sectors (FAT-12 = 1) */
   unsigned char BPB_NumFATs;      /* FAT tables on the disk (should be 2) */
   unsigned short BPB_RootEntCnt;   /* Max directory entries in root directory */
   unsigned short BPB_TotSec16;     /* FAT-12 total number of sectors on the disk */
   unsigned char BPB_Media;        /* Media type {fixed, removable, etc.} */
   unsigned short BPB_FATSz16;      /* Sector size of FAT table (FAT-12 = 9) */
   unsigned short BPB_SecPerTrk;    /* # of sectors per cylindrical track */
   unsigned short BPB_NumHeads;     /* # of heads per volume (1.4Mb 3.5" = 2) */
   unsigned long BPB_HiddSec;      /* # of preceding hidden sectors (0) */
   unsigned long BPB_TotSec32;     /* # of FAT-32 sectors (0 for FAT-12) */
   unsigned char BS_DrvNum;        /* A drive number for the media (OS specific) */
   unsigned char BS_Reserved1;     /* Reserved space for Windows NT (set to 0) */
   unsigned char BS_BootSig;       /* (0x29) Indicates following: */
   unsigned long BS_VolID;         /* Volume serial # (for tracking this disk) */
   unsigned char BS_VolLab[11];    /* Volume label (RDL or "NO NAME    ") */
   unsigned char BS_FilSysType[8]; /* Deceptive FAT type Label */

} BSStruct;

#pragma pack(pop) /* End strict alignment */

Organization of the file allocation tables (FAT)

fat The FAT-12 tables are organized as a linear array of FAT entries. A FAT entry is a single 12 bit value whose index matches the associated cluster in the data area of the disk. Because the first cluster in the data area of the disk corresponds to cluster 2, the first two FAT entries in the FAT table are reserved. You can think of the FAT table as a large array that contains pointers into the data area of the disk. It is also a linked-list of sorts, because you retrieve files that are longer than one cluster by referencing the FAT table, which points you to the next cluster of that file or directory.

A 12-bit FAT entry may have the following values

0x000 Unused cluster.
0xFF0-0xFF6 Reserved cluster (like FAT entries 0 and 1).
0xFF7 Bad Cluster (contains errors and should not be used).
0xFF8-0xFFF End of file/directory, also called EOC (end of cluster chain).
other numbers are pointers (indexes) to the next cluster in the file/directory.

Cluster chains are a series of FAT entries that point to the next cluster in the file/directory, and are terminated by an EOC indicator, exactly like linked-lists. For example, in the graphic above, the cluster chain for File1.txt is retrieved by starting at the first cluster indicated in the directory for that file (the bottom box in the above graphic) and following the pointers in the FAT table until you encounter the EOC terminator. The chain would comprise clusters 2, 4, 6, and 7. The cluster chain for MyDir is 3, 5. You know that you've reached the end of a file or directory listing when you check the corresponding FAT entry for the current cluster, and find that it contains the EOC flag.

When you are modifying and updating the FAT table, make sure that you update both copies of the FAT table! Command chkdsk should validate a disk volume by comparing the FAT tables to make sure that they are identical. Any discrepancies will cause a chkdsk error.

There are 3072 FAT entries in each FAT table (512 bytes per sector * 9 sectors = 4608 bytes. 4608 bytes / 1.5 bytes per FAT entry = 3072 FAT entries). Be aware that this is a few more entries than the disk has clusters; therefore, maximum capacity of the disk should be determined by the total number of clusters, not the number of possible FAT table entries.

Programming Resources: Tracking free space

In the MS-DOS FAT-12 file system, there is no structure or system that keeps track of how much free space is available on the disk. Thus, in order to manage this information yourself, you have several choices.

Sum up the number of free clusters on the disk when it is first mounted, place that value in some variable, and then just keep track of every new FAT table entry allocated and deleted throughout the life of your program (assuming that you are the only one that is accessing the drive, you'll be fine).
Create a free-space bitmap, similar to the one explained in Stallings, chapter 12, page 549, that associates a bit to each cluster in the FAT table.
Implement your own creative method!

File/Directory Masks

A file name mask is used to selectively qualify directory entries. A wildcard is a special character that can fill in for missing letter(s) in file and directory names when used to add specificity to OS commands. Although there are various ways to define the syntax and semantics of a mask string, we will use the following rules:

A mask has two parts, one for the file name and the other for the extension. If an extension mask is included, it is separated from the file mask by a period.
A null file mask string qualifies all entries (regardless of extension.)
An ‘*’ qualifies all characters to the right and cannot be followed by any characters (other than a period and an extension mask.) There can be only one “*” per file name and/or extension mask. Thus a single “*” would select all files without an extension. A “*.*” mask would be the same as a null string and select all files with or without an extension.
A ‘?’ is a wild card for any single character and requires the presence of a character in the file name to qualify. It may be used multiple times in the file name and/or file extension mask.

Examples: *.* All files

B* All files beginning with ‘B’ and no extension

???*.C All files of length at least 3 with ‘C’ extension

*.TXT All files with “TXT” extension

FAT-12 file name and extension representation

File names in DOS traditionally have a limit of 8 characters for the name, and 3 characters for the extension. (Modern FAT file systems allow for longer file names. See Microsoft's White Pages--the good part for details.) In your File Management System, you will need to translate the file names that you are given, into a form that you can use to search directory entries. Here are a few things to be aware of:

File/directory names and extensions are not null-terminated within the directory entry
File/directory names always occupy 8 bytes--if the file/directory name is shorter than 8 bytes (characters), pad the remaining bytes with spaces (0x20). This also applies to 3-character extensions.
File/directory names and extensions are always uppercase. Always convert given file/directory names to uppercase.
Directory names can have extensions too.
"FILE1" and "FILE1.TXT" are unique (the extension does matter).
Files and directories cannot have the same name (even though the attributes are different).
Do not accept file/directory names longer than 8 characters + 3 character extension, unless your FMS will handle long file names.

Here are examples of how some file names would translate into the 11 bytes allocated for the file/directory name and extension in the directory entry (white space between quotes should be considered as spaces).

filename [01234567012]

"foo.bar" Þ "FOO BAR"

"FOO.BAR" Þ "FOO BAR"

"Foo.Bar" Þ "FOO BAR"

"foo" Þ "FOO "

"foo." Þ "FOO "

"PICKLE.A" Þ "PICKLE A "

"prettybg.big" Þ "PRETTYBGBIG"

".big" Þ illegal! file/directory names cannot begin with a "."

Directory structures and their fields

A directory entry contains all of the information necessary to reference the contents of a file or directory on the volume, including additional information about the file/directory's attributes and other information including: time of creation/modification, date created, and size of the file (in bytes). All of this information is packed into a small 32-byte structure. Directories with more than 16 entries require more than one sector (maximum entries per sector = 512 bytes per sector / 32 bytes per directory entry = 16 directory entries).

Directories in the data area of the disk can be of any length (number of sectors). However, because the root directory is of fixed size, only a set number of directory entries will fit.

Subdirectories, directories other than the root directory, also contain two directory entries by default. These are the dot [.] and dotdot [..] entries--you have probably seen these before. These entries are simply pointers, the first [dot] points to the current directory, kind of like the "this" pointer in C++. The second entry [dotdot] is a pointer to the parent directory. This is the directory that you are specifying with the command "CD ..". A directory that includes these two entries is still considered empty.

A directory entry contains the following fields:

#pragma pack(push, 1) /* Byte align in memory (no padding) */

typedef struct

{ unsigned char Name[8];       /* File name (capital letters, padded w/spaces) */
   unsigned char Extension[3]; /* Extension (same format as name, no '.') */
   unsigned char Attributes;    /* Holds the attributes code */
   unsigned char Reserved[10]; /* Reserved for Windows NT (Set to zero!) */
   unsigned short Time;          /* Time of last write */
   unsigned short Date;          /* Date of last write */
   unsigned short startCluster; /* Pointer to the first cluster of the file */
   unsigned long fileSize;      /* File size in bytes */

} DirEntry;

#pragma pack(pop) /* End strict alignment */

The file/directory's attributes can have the following values. Consider using #define for these within your FMS.

#define N_FILE 0x00

#define READ_ONLY 0x01

#define HIDDEN 0x02

#define SYSTEM 0x04

#define VOLUME 0x08 // this is the volume label entry

#define DIRECTORY 0x10

#define ARCHIVE 0x20 // same as file

For example, a file with read-only, hidden, system, and archive attributes set might be created by using the bitwise OR operator.

unsigned short attributes = 0x00;
attributes = READ_ONLY | HIDDEN | SYSTEM | ARCHIVE; // 0x27

The directory and archive attributes are mutually exclusive, that is, a file cannot be both a directory and archived at the same time. The only way to distinguish a file from a directory is by checking the attribute.

For subdirectories that refer back to the root directory, the startCluster value will be 0. Note that cluster 0 is reserved in the FAT table--attempting to look up that value should result in an error. You must provide a check that will re-direct a startCluster of zero to the root directory.

When creating directory entries for new files that have nothing in them (a fileSize of 0), the startCluster should always be set to 0; otherwise chkdsk should report an error. Just remember to change the startCluster to the correct cluster after you have written to the file.

How DOS deletes files/directories

DOS does not immediately erase a file or directory when it is deleted. In fact, it does nothing to the clusters that contain the information (this is why it is sometimes possible to un-delete something). However, DOS does zero out the file/directory's cluster chain from the FAT table and places a special character (0xe5) in the first byte of the directory entry signaling that this entry has been deleted. You will need to do the same when a file or directory is deleted. Start with the cluster indicated in the directory entry, traverse the cluster chain, and set each FAT entry to zero including the EOC entry. When reading directory entries, ignore all entries that begin with 0xe5.

Programming Resources: Retrieving 12-bit entries from the FAT table

The FAT-12 table consists of an array of 12-bit entries that correspond to clusters in the data area of the disk. Retrieving a single FAT entry is slightly complicated because we cannot easily create a data structure of 12-bit or 1½ unsigned char entries. You may store the FAT table any way you wish, but the following code sample assumes that you've chosen to store the FAT table in an array of unsigned chars. Because we chose this, we will need to grab an entire unsigned short that covers the 12-bit FAT entry, with four bits to spare, and then extract just the 12 bits that we want (either the high-order or low-order bits, depending on whether the original FAT index was even or odd). The functions for retrieving and setting a FAT entry are provided below.

GetFatEntry. Index into an unsigned char FAT table and return an unsigned short containing the 12-bit value at that location.

// Return an unsigned short containing the 12-bit FAT entry code at FATindex

unsigned short GetFatEntry(int FATindex)
{

unsigned short FATEntryCode; // The return value
int FatOffset = ((FATindex * 3) / 2); // Calculate the offset

if (FATindex % 2 == 1) // If the index is odd
{ // Pull out a unsigned short from a unsigned char array

FATEntryCode = *((unsigned short *) &FAT[FatOffset]);
FATEntryCode >>= 4; // Extract the high-order 12 bits

}
else // If the index is even
{ // Pull out a unsigned short from a unsigned char array

FATEntryCode = *((unsigned short *) &FAT[FatOffset]);

FATEntryCode &= 0x0fff; // Extract the low-order 12 bits

}
return FATEntryCode;

} // End GetFatEntry

SetFatEntry. Insert an unsigned short the 12-bit value FAT12ClusEntryVal into an unsigned char FAT array at index FATindex. This function only writes the FAT12ClusEntryVal to one of the FAT tables! Make sure that you update both FAT tables on disk.

void SetFatEntry(int FATindex, unsigned short FAT12ClusEntryVal)
{

int FATOffset = ((FATindex * 3) / 2); // Calculate the offset
int FATData = *((unsigned short*)&FAT[FATOffset]);

if (FATindex % 2 == 0) // If the index is even
{ FAT12ClusEntryVal &= 0x0FFF; // mask to 12 bits

FATData &= 0xF000; // mask complement

}
else // Index is odd
{ FAT12ClusEntryVal <<= 4; // move 12-bits high

FATData &= 0x000F; // mask complement

}

// Update FAT entry value in the FAT table
*((unsigned short *)&FAT[FATOffset]) = FATData | FAT12ClusEntryVal;

} // End SetFatEntry

Programming Resources: Formatting and printing directory entries

According to the MS-DOS white pages, a directory entry whose first byte (the first byte of the file name) starts with 0xe5 is considered a deleted entry, and should not be printed. Also, an entry whose first byte starts with 0x00 indicates that this directory entry is empty and that none of the directory entries following that entry are valid.

The following function example takes a directory entry structure, formats a directory listing, and prints the resulting string. All these structures need to be BYTE aligned with #pragma directive.

Bit Field structure that will receive the time:

#pragma pack(push,1) // BYTE align in memory (no padding)

typedef struct

{ // (total 16 bits--a unsigned short)

unsigned short sec: 5; // low-order 5 bits are the seconds

unsigned short min: 6; // next 6 bits are the minutes

unsigned short hour: 5; // high-order 5 bits are the hour

} FATTime;

#pragma pack(pop) // End of strict alignment

Bit Field structure that will receive the date:

#pragma pack(push,1) // BYTE align in memory (no padding)

typedef struct

{ // (total 16 bits--a unsigned short)

unsigned short day: 5; // low-order 5 bits are the day

unsigned short month: 4; // next 4 bits are the month

unsigned short year: 7; // high-order 7 bits are the year

} FATDate;

#pragma pack(pop) // End of strict alignment

PrintDirectoryEntry function takes a directory entry dirent, formats it, and prints the resulting directory entry string.

void PrintDirectoryEntry(DirEntry dirent)

{

int i = 7;

char p_string[64] = " ----- 00:00:00 03/01/2004";

char tempstring[10];

FATDate date; // The Date bit field structure

FATTime time; // The Time bit field structure

strncpy(p_string, (char*)&dirent.Name, 8); // Copies 8 bytes from the name

while (p_string[i] == ' ') i--;

p_string[i+1] = '.'; // Add extension

strncpy(&p_string[i+2], (char*)&dirent.Extension, 3);

while (p_string[i+4] == ' ') i--;

if (p_string[i+4] == '.') p_string[i+4] = ' ';

// Generate the attributes

if(dirent.Attributes &0x01) p_string[14] = 'R';

if(dirent.Attributes &0x02) p_string[15] = 'H';

if(dirent.Attributes &0x04) p_string[16] = 'S';

if(dirent.Attributes &0x08) p_string[17] = 'V';

if(dirent.Attributes &0x10) p_string[18] = 'D';

if(dirent.Attributes &0x20) p_string[19] = 'A';

// Extract the time and format it into the string

memcpy(&time, &dirent.Time, sizeof(FATTime));

sprintf(&p_string[21], "%02d:%02d:%02d", time.hour, time.min, time.sec*2);

// Extract the date and format it into the string

memcpy(&date, &dirent.Date, sizeof(FATDate));

sprintf(p_string, "%s %02d/%02d/%04d %d %d", p_string,

date.month, date.day, date.year+1980,

dirent.startCluster, dirent.fileSize);

printf("\n%s", p_string); // p_string is now ready!

return;

} // end PrintDirectoryEntry

Programming Resources: Formatting and printing the FAT

The PrintFatTable function shows one method of formatting and printing the FAT table for easy viewing. This function organizes the FAT table into 10 columns. It relies on the helper function GetFatEntry that retrieves a FAT table entry when given an index into the FAT table.

RetrieveFatTable prints the first n FAT entries in FAT table.

void PrintFatTable(int n)

{

char tbuf[16];

char fbuf[100];

unsigned short fatvalue;

int size = (512 * 9) / 1.5; // Max fat entries in the FAT table

if (n < size) size = n;

for (int i=0; i<size; i++)

{ if ((i%10) == 0)

{ if (i) printf("%s", fbuf);

sprintf(fbuf, "\n %6d:", i);

}

fatvalue = GetFatEntry(i);

// Special formatting cases...

if (i < 2) sprintf(tbuf, " RSRV"); // A reserved cluster

else if (fatvalue == 4095) sprintf(tbuf, " EOC"); // The EOC marker

else if (fatvalue == 4087) sprintf(tbuf, " BAD"); // The BAD cluster marker

else sprintf(tbuf,"%5d", fatvalue); "); // Output fat value

strcat(fbuf, tbuf);

}

return;

} // End PrintFatTable

Programming Time

#include <stdio.h>

#include <time.h>

#pragma pack(push,1) // BYTE align in memory (no padding)

typedef struct

{ // (total 16 bits--a unsigned short)

unsigned short sec: 5; // low-order 5 bits are the seconds

unsigned short min: 6; // next 6 bits are the minutes

unsigned short hour: 5; // high-order 5 bits are the hour

} FATTime;

#pragma pack(pop) // End of strict alignment

#pragma pack(push,1) // BYTE align in memory (no padding)

typedef struct

{ // (total 16 bits--a unsigned short)

unsigned short day: 5; // low-order 5 bits are the day

unsigned short month: 4; // next 4 bits are the month

unsigned short year: 7; // high-order 7 bits are the year

} FATDate;

#pragma pack(pop) // End of strict alignment

#pragma pack(push,1) // BYTE align in memory (no padding)

typedef struct

{ unsigned char name[8]; // File name

unsigned char extension[3]; // Extension

unsigned char attributes; // Holds the attributes code

unsigned char reserved[10]; // Reserved

// unsigned short time; // Time of last write

// unsigned short date; // Date of last write

FATTime time; // Time of last write

FATDate date; // Date of last write

unsigned short startCluster; // Pointer to the first cluster of the file.

unsigned long fileSize; // File size in bytes

} DirEntry;

#pragma pack(pop) // End of strict alignment

int main()

{

DirEntry dirEntry;

time_t a;

struct tm *b;

FATDate *d;

FATTime *t;

// capture local time and date

time(&a);

b = localtime(&a); // get local time

d = (FATDate*)&(dirEntry.date); // point to date w/in dir entry

d->year = b->tm_year + 1900 - 1980; // update year

d->month = b->tm_mon; // update month

d->day = b->tm_mday; // update day

t = (FATTime*)&(dirEntry.time); // point to time w/in dir entry

t->hour = b->tm_hour; // update hour

t->min = b->tm_min; // update minute

t->sec = b->tm_sec; // update second

// convert back

struct tm xx, *x = &xx;

x->tm_wday = 0;

x->tm_yday = 0;

x->tm_isdst = 0;

d = (FATDate*)&(dirEntry.date); // point to date w/in dir entry

x->tm_year = d->year + 1980 - 1900; // update year

x->tm_mon = d->month; // update month

x->tm_mday = d->day; // update day

t = (FATTime*)&(dirEntry.time); // point to time w/in dir entry

x->tm_hour = t->hour; // update hour

x->tm_min = t->min; // update minute

x->tm_sec = t->sec; // update second

char buf[100];

int size;

size = strftime(buf, 99, "%A, %B %d, %Y %H:%M:%S", x);

printf("\nDate: %s", buf);

printf("\nDate: %s", asctime(x));

return 0;

}

Programming Extras: Microsoft's White Pages

Microsoft's Hardware White Pages document 34 pages of everything you ever wanted (or didn’t want) to know about FAT. Use this document to learn more about the boot sector, how to determine the type of FAT file system, and long file and directory names.